Data Balancing Techniques for Predicting Student Dropout Using Machine Learning
Abstract
Predicting student dropout is a challenging problem in the education sector. This is due to an imbalance in student dropout data, mainly because the number of registered students is always higher than the number of dropout students. Developing a model without taking the data imbalance issue into account may lead to an ungeneralized model. In this study, different data balancing techniques were applied to improve prediction accuracy in the minority class while maintaining a satisfactory overall classification performance. Random Over Sampling, Random Under Sampling, Synthetic Minority Over Sampling (SMOTE), SMOTE with Edited Nearest Neighbor, and SMOTE with Tomek links were tested, along with three popular classification models: Logistic Regression, Random Forest, and Multi-Layer Perceptron. Publicly accessible datasets from Tanzania and India were used to evaluate the effectiveness of the balancing techniques and prediction models. The results indicate that SMOTE with Edited Nearest Neighbor achieved the best classification performance on the 10-fold holdout sample. Furthermore, Logistic Regression correctly classified the largest number of dropout students (57348 for the Uwezo dataset and 13430 for the India dataset) using the confusion matrix as the evaluation matrix. The applications of these models allow for the precise prediction of at-risk students and the reduction of dropout rates.
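As a rough illustration of the two simplest balancing techniques named in the abstract, the sketch below implements random over-sampling and random under-sampling in plain Python. It is not the study's code: the function names, the toy data, and the fixed seed are assumptions made for this example, and the SMOTE-based variants (which synthesize new minority samples rather than copying or dropping existing ones) are not shown.

```python
import random
from collections import Counter


def random_over_sample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of smaller classes until all classes match the largest."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        # Draw with replacement from the existing rows of this class.
        extra = [rng.choice(pool) for _ in range(target - n)]
        out_rows.extend(extra)
        out_labels.extend([cls] * len(extra))
    return out_rows, out_labels


def random_under_sample(rows, labels, seed=0):
    """Keep a random subset of every class, sized to the smallest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = min(counts.values())
    out_rows, out_labels = [], []
    for cls in counts:
        pool = [r for r, y in zip(rows, labels) if y == cls]
        # Draw without replacement so no row is kept twice.
        out_rows.extend(rng.sample(pool, target))
        out_labels.extend([cls] * target)
    return out_rows, out_labels


# Hypothetical toy data: 8 enrolled students (label 0) and 2 dropouts (label 1).
rows = [[i] for i in range(10)]
labels = [0] * 8 + [1] * 2

over_rows, over_labels = random_over_sample(rows, labels)
under_rows, under_labels = random_under_sample(rows, labels)
print(Counter(over_labels))
print(Counter(under_labels))
```

When resampling is combined with cross-validation, it is usually applied to the training folds only, so that the held-out fold keeps the original class distribution.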
Similar resources
Preventing Student Dropout in Distance Learning Using Machine Learning Techniques
Student dropout occurs quite often in universities providing distance education. The scope of this research is to study whether the usage of machine learning techniques can be useful in dealing with this problem. Subsequently, an attempt was made to identify the most appropriate learning algorithm for the prediction of students' dropout. A number of experiments have taken place with data pro...
Prediction of Student Learning Styles using Data Mining Techniques
This paper focuses on the prediction of student learning styles using data mining techniques within their institutions. This prediction was aimed at finding out how different learning styles are achieved within learning environments which are specifically influenced by already existing factors. These learning styles have been affected by different factors that are mainly engraved and found wit...
Predicting College Students Dropout using EDM Techniques
This study examines the factors affecting students’ academic performance that contribute to the prediction of their failure and dropout using educational data mining techniques. This paper suggests the use of various classification techniques to identify the weak students who are likely to perform poorly in their academics. WEKA, an open source data mining tool was used to evaluate the attribut...
Machine Learning Models for Housing Prices Forecasting using Registration Data
This article has been compiled to identify the best model of housing price forecasting using machine learning methods with maximum accuracy and minimum error. Five important machine learning algorithms are used to predict housing prices, including Nearest Neighbor Regression Algorithm (KNNR), Support Vector Regression Algorithm (SVR), Random Forest Regression Algorithm (RFR), Extreme Gradient B...
Using Three Machine Learning Techniques for Predicting Breast Cancer Recurrence
Introduction Breast cancer (BC) is the most common cancer in women, affecting about 10% of all women at some stages of their life. In recent years, the incidence rate keeps increasing and data show that the survival rate is 88% after five years from diagnosis and 80% after 10 years from diagnosis [1]. Early prediction of breast cancer is one of the most crucial works in the follow-up process. D...
Journal
Journal title: Data
Year: 2023
ISSN: 2306-5729
DOI: https://doi.org/10.3390/data8030049